1 Introduction

Every lazy functional programmer knows about the following approach to enumerating the 
positive rationals: generate a two-dimensional matrix (an infinite list of infinite lists), then 
traverse its finite diagonals (an infinite list of finite lists). Each row of the matrix has the 
positive rationals with a given denominator, and each column those with a given numerator: 

    1/1  2/1  3/1  ...  m/1  ...
    1/2  2/2  3/2  ...  m/2  ...
     :    :    :         :
    1/n  2/n  3/n  ...  m/n  ...
     :    :    :         :

Since each row is infinite, the rows cannot simply be concatenated. However, each of the
diagonals from upper right to lower left, containing rationals with numerator and denominator
of a given sum, is finite, so these can be concatenated:

    rats1 :: [Rational]
    rats1 = concat (diags [[m / n | m <- [1 ..]] | n <- [1 ..]])

    diags = diags' []
      where diags' xss (ys : yss) = map head xss : diags' (ys : map tail xss) yss

Equivalently, one can deforest the matrix altogether, and generate the diagonals directly:

    rats2 :: [Rational]
    rats2 = concat [[m / (d - m) | m <- [1 .. d - 1]] | d <- [2 ..]]

All very well, but the resulting enumeration of the positive rationals contains duplicates — 
in fact, infinitely many duplicates of every rational. 

One could enumerate the rationals without duplication indirectly, by filtering the coprime
pairs from those generated as above. In this paper, however, we explain an elegant
technique for enumerating the positive rationals directly, without duplicates. Moreover,
we show how to do so as a simple iteration, generating each element of the enumeration
from the previous one alone, with constant cost (in terms of number of arbitrary-precision
simple arithmetic operations) per element. Best of all, the resulting programs are extremely
simple, simpler even than the two programs above. The mathematical results are not
new (Calkin & Wilf, 2000; Newman, 2003); however, we believe that they deserve wider
appreciation in the functional programming community. Besides, the exercise provides
some compelling examples of unfolds on infinite trees.
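
As a point of comparison, the indirect approach of filtering coprime pairs might look like the following minimal sketch. The name ratsCoprime is ours, not the paper's, and the test for lowest terms uses the Prelude gcd.

```haskell
import Data.Ratio ((%))

-- Indirect enumeration: walk the finite diagonals as in rats2, but keep
-- only the pairs that are already in lowest terms.
-- (ratsCoprime is a hypothetical name introduced for this sketch.)
ratsCoprime :: [Rational]
ratsCoprime =
  [ m % (d - m) | d <- [2 ..], m <- [1 .. d - 1], gcd m (d - m) == 1 ]
```

In GHCi, `take 7 ratsCoprime` gives the rationals 1, 1/2, 2, 1/3, 3, 1/4, 2/3. Each element costs a gcd computation, and skipped pairs cost one too, so this does not achieve constant cost per element.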

2 Greatest common divisor 

The diagonalization approach to enumerating the rationals is based on generating the pairs 
of positive integers. The essence of the problem with this approach is that the natural 
correspondence via division between integer pairs and rationals is not a bijection: although 
every rational is represented, many integer pairs represent the same rational. Obviously, 
therefore, enumerating the rationals by generating the integer pairs yields duplicates. 

Equally obviously, a solution to the problem can be obtained by finding a simple-to-enumerate
set with a simple-to-compute bijection to the rationals. Both constraints on
simplicity are necessary. The naturals are simple to enumerate, and there clearly exists a 
bijection between the naturals and the rationals; but this bijection is not simple to compute. 
On the other hand, there is a simple bijection from the rationals to themselves, but that still 
begs the question of how to enumerate the rationals. 

The crucial insight is the relationship between rationals and greatest common divisors. 
Recall Euclid's subtractive algorithm for computing greatest common divisor: 

    gcd :: (Integer, Integer) -> Integer
    gcd (m, n) = if m < n then gcd (m, n - m) else
                 if m > n then gcd (m - n, n) else m

Consider the following 'instrumented version', that returns not only the greatest common 
divisor, but also a trace of the execution by which it is computed: 

    igcd :: (Integer, Integer) -> (Integer, [Bool])
    igcd (m, n) = if m < n then step False (igcd (m, n - m)) else
                  if m > n then step True (igcd (m - n, n)) else (m, [])
      where step b (d, bs) = (d, b : bs)

Given a pair (m, n), the function igcd returns a pair (d, bs), where d is gcd (m, n) and bs is
the list of booleans recording the 'execution path' (that is, a list of the branches taken)
when evaluating gcd (m, n). Let us introduce the function pgcd, so that bs = pgcd (m, n).
These two pieces of data together are sufficient to invert the computation and reconstruct
m and n; that is, given:

    ungcd :: (Integer, [Bool]) -> (Integer, Integer)
    ungcd (d, bs) = foldr undo (d, d) bs
      where undo False (m, n) = (m, n + m)
            undo True  (m, n) = (m + n, n)

then ungcd and igcd are each other's inverses, and so there is a bijection between integer 
pairs (m,n) and their images (d,bs) under igcd. 

Now, gcd (m, n) is exactly what is superfluous in the mapping from (m, n) to the rational
m/n, and pgcd (m, n) is exactly what is relevant in this mapping, since two pairs (m, n) and
(m', n') represent the same rational iff they have the same pgcd:

    m/n = m'/n'  ⟺  pgcd (m, n) = pgcd (m', n')




Fig. 1 . The first few levels of the Stern-Brocot tree. 



Moreover, pgcd is surjective: every finite boolean sequence is the pgcd of some pair. The 
function ungcd gives a constructive proof of this, by reconstructing such pairs. Therefore we 
can enumerate the rationals by enumerating the finite boolean sequences: the enumeration 
is easy enough, and the bijection to the rationals is simple to compute, via ungcd: 

    rats3 :: [Rational]
    rats3 = map (mkRat . curry ungcd 1) boolseqs

    boolseqs = [] : [b : bs | bs <- boolseqs, b <- [False, True]]

    mkRat :: (Integer, Integer) -> Rational
    mkRat (m, n) = fromInteger m / fromInteger n
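
The definitions of this section can be assembled into one self-contained module, sketched below. The guards replace the nested conditionals, and mkRat is written here with (%) from Data.Ratio, which is our choice for this sketch; everything else follows the text.

```haskell
import Data.Ratio ((%))

-- Instrumented subtractive Euclid: the gcd together with the branches taken.
igcd :: (Integer, Integer) -> (Integer, [Bool])
igcd (m, n)
  | m < n     = step False (igcd (m, n - m))
  | m > n     = step True  (igcd (m - n, n))
  | otherwise = (m, [])
  where step b (d, bs) = (d, b : bs)

-- The execution path alone.
pgcd :: (Integer, Integer) -> [Bool]
pgcd = snd . igcd

-- Inverse of igcd: replay the path backwards from (d, d).
ungcd :: (Integer, [Bool]) -> (Integer, Integer)
ungcd (d, bs) = foldr undo (d, d) bs
  where undo False (m, n) = (m, n + m)
        undo True  (m, n) = (m + n, n)

-- All finite boolean sequences, shortest first.
boolseqs :: [[Bool]]
boolseqs = [] : [b : bs | bs <- boolseqs, b <- [False, True]]

rats3 :: [Rational]
rats3 = map (mkRat . curry ungcd 1) boolseqs
  where mkRat (m, n) = m % n
```

For instance, `pgcd (6, 8)` and `pgcd (3, 4)` are both `[False, True, True]`, `ungcd (igcd (6, 8))` returns `(6, 8)`, and `take 7 rats3` yields 1, 1/2, 2, 1/3, 3/2, 2/3, 3.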



3 The Stern-Brocot tree 

A standard way of representing a mapping from finite strings over some alphabet is with a 
trie: a tree of degree equal to the size of the alphabet, in which the paths form the (prefixes 
of all the) strings in the domain of the mapping, and the image of every string is located in 
the tree at the end of the corresponding path (Knuth, 1998; Thue, 1912). In this case, the 
alphabet is binary, with the two symbols False and True, so the tree is binary too; and every 
finite string is in the domain of the mapping, so every node of the tree is the location of some 
rational. The first few levels are shown in Figure 1 (the significance of the two pseudo-nodes
labelled 0/1 and 1/0 will be made clear shortly). For example, pgcd (3, 4) is [False, True, True],
so the rational 3/4 appears at the end of the path [L, R, R], that is, as the rightmost grandchild
of the left child of the root; the root is labelled 1/1, since (1, 1) yields the empty execution
path. This tree turns out to be well-known; Graham, Knuth and Patashnik (1994, §4.5) call
it the Stern-Brocot tree, after its two independent nineteenth-century discoverers. It enjoys
the following two properties, among many others:

• The tree is an infinite binary search tree, so any finite pruning has an increasing 
inorder traversal. 

For example, pruning to include the level with 1/3 and 3/1 but nothing deeper yields a tree
with inorder traversal 1/3, 1/2, 2/3, 1/1, 3/2, 2/1, 3/1, which is increasing.

• Every node is labelled with a rational (m+m')/(n+n'), the 'intermediary' of m/n, the label
of its rightmost left ancestor, and m'/n', that of its leftmost right ancestor.

For example, the node labelled 3/4 has ancestors 2/3, 1/2, 1/1, 0/1, 1/0, of which 1/1 and
1/0 are to the right and the others to the left. The rightmost left ancestor is 2/3 and the
leftmost right ancestor 1/1, and indeed 3/4 = (2+1)/(3+1). That is why we included the two
pseudo-nodes 0/1 and 1/0 in Figure 1: they are needed to make this relationship work for
nodes like 1/3 and 3/1 on the boundary of the tree proper.

The latter property explains how to generate the tree directly, dispensing with the
sequences of booleans. The seed from which the tree is grown consists of its rightmost left
and leftmost right ancestors, initially the two pseudo-nodes. The tree root is their
intermediary, which then acts as one half of the seed for each subtree.

    data Tree a = Node (a, Tree a, Tree a)

    foldt f (Node (a, x, y)) = f (a, foldt f x, foldt f y)

    unfoldt f x = let (a, y, z) = f x in Node (a, unfoldt f y, unfoldt f z)

    rats4 :: [Rational]
    rats4 = bf (unfoldt step ((0, 1), (1, 0)))
      where step (l, r) = let m = adj l r in (mkRat m, (l, m), (m, r))

    adj (m, n) (m', n') = (m + m', n + n')

    bf = concat . foldt glue
      where glue (a, xs, ys) = [a] : zipWith (++) xs ys
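
To see the two properties concretely, here is a small self-contained sketch that grows the tree from the pseudo-nodes and inspects a finite pruning. The helpers levels and inorderTo, and the name sternBrocot, are ours and serve only for inspection.

```haskell
import Data.Ratio ((%))

data Tree a = Node (a, Tree a, Tree a)

unfoldt :: (b -> (a, b, b)) -> b -> Tree a
unfoldt f x = let (a, y, z) = f x in Node (a, unfoldt f y, unfoldt f z)

-- The Stern-Brocot tree, grown from the pseudo-nodes 0/1 and 1/0.
sternBrocot :: Tree Rational
sternBrocot = unfoldt step ((0, 1), (1, 0))
  where step (l, r) = let m = adj l r in (mkRat m, (l, m), (m, r))
        adj (m, n) (m', n') = (m + m', n + n')
        mkRat (m, n) = m % n

-- Hypothetical helper: the labels of the tree, one list per level.
levels :: Tree a -> [[a]]
levels (Node (a, x, y)) = [a] : zipWith (++) (levels x) (levels y)

-- Hypothetical helper: inorder traversal of the tree pruned to k levels.
inorderTo :: Int -> Tree a -> [a]
inorderTo 0 _ = []
inorderTo k (Node (a, x, y)) =
  inorderTo (k - 1) x ++ [a] ++ inorderTo (k - 1) y
```

Here `take 3 (levels sternBrocot)` gives the levels [1], [1/2, 2] and [1/3, 2/3, 3/2, 3], while `inorderTo 3 sternBrocot` gives the increasing list 1/3, 1/2, 2/3, 1, 3/2, 2, 3 from the first property.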

Alternatively, one could deforest the tree itself and generate the levels directly. Start with the
first level, consisting of the two pseudo-nodes, and repeatedly insert new nodes (m+m')/(n+n')
between each existing adjacent pair m/n, m'/n'.

    rats5 :: [Rational]
    rats5 = concat (unfolds infill [(0, 1), (1, 0)])

    unfolds f a = let (b, a') = f a in b : unfolds f a'

    infill xs = (map mkRat ys, interleave xs ys)
      where ys = zipWith adj xs (tail xs)

    interleave (x : xs) ys = x : interleave ys xs
    interleave []       [] = []

An additional interesting property of the Stern-Brocot tree is that it forms the basis for 
a number representation system (credited by Graham, Knuth and Patashnik to Minkowski 
in 1904, exactly a century ago at the time of writing). Every rational is represented by the 
unique finite boolean sequence recording the path to it in the tree. An irrational number is 
represented by the unique infinite boolean sequence that converges on where it belongs; for 
example, 5/2 < e < 3/1, so e has a representation starting [True, True, False, True, ...].
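
The representation can be computed by a binary search between the two pseudo-nodes. The following is a sketch under our own naming (sbPath is not from the paper); it converts its argument with toRational, so for a Double it follows the exact value of the floating-point approximation rather than the true real number.

```haskell
import Data.Ratio ((%))

-- Path to x in the Stern-Brocot tree: False = left, True = right.
-- The search keeps the current left and right bounds as integer pairs,
-- starting from the pseudo-nodes (0, 1) and (1, 0).
sbPath :: Real a => a -> [Bool]
sbPath x = go (0, 1) (1, 0)
  where
    t = toRational x
    go (a, b) (c, d)
      | t < m     = False : go (a, b) (p, q)
      | t > m     = True  : go (p, q) (c, d)
      | otherwise = []
      where (p, q) = (a + c, b + d)   -- the intermediary of the two bounds
            m      = p % q
```

For example, `sbPath (3 % 4 :: Rational)` is `[False, True, True]`, and `take 4 (sbPath (exp 1 :: Double))` is `[True, True, False, True]`, matching the prefix given for e.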



4 The Calkin-Wilf tree

The Stern-Brocot tree is the trie of the mapping from boolean sequences pgcd (m, n) to
rationals m/n. But since all boolean sequences appear in the domain of this mapping (the
tree is complete), so do their reverses, and we might just as well build the mapping from
the reverse of pgcd (m, n) to the same rational m/n. We call this tree the Calkin-Wilf tree,
after its two explorers (Calkin & Wilf, 2000), whose work is promoted as one of Aigner and
Ziegler's Proofs from The Book (2004, Chapter 16). The first few levels of the Calkin-Wilf
tree are shown in Figure 2.

Whereas in the Stern-Brocot tree the path from the root to a node m/n records the trace
of the computation of gcd (m,n), in the Calkin-Wilf tree it is the path to the root from that 
node that records the trace. One might argue that this orientation is more natural. 

Of course, a given level k of the Calkin-Wilf tree and of the Stern-Brocot tree contain the 
same collection of rationals (namely, those on which Euclid's subtractive algorithm takes k 
steps); but the two collections are generally in a different order: the Calkin-Wilf tree is not 
a binary search tree. 

In fact, each level of the Calkin-Wilf tree is the bit-reversal permutation (Hinze, 2000;
Bird et al., 1999) of the corresponding level of the Stern-Brocot tree. For example, if the
elements of the lowest level shown in Figure 1 are numbered in binary 000 to 111 from left
to right, they appear in Figure 2 in the order 000, 100, 010, 110, 001, 101, 011, 111, which
are the reversals of the binary numbers 000 to 111. Bit-reversal of the levels arises naturally
from reversal of the paths.
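
This relationship can be checked mechanically. The sketch below builds a level of either tree directly from the boolean paths, using the undo steps of ungcd, and compares them through a bit-reversal permutation. The names fromPath, sbLevel, cwLevel and bitrev are ours, and the argument k counts path length, so level k here is the (k+1)th level of the figures.

```haskell
import Control.Monad (replicateM)
import Data.Ratio ((%))

-- The rational whose execution path is bs (ungcd specialised to d = 1).
fromPath :: [Bool] -> Rational
fromPath bs = let (m, n) = foldr undo (1, 1) bs in m % n
  where undo False (m, n) = (m, n + m)
        undo True  (m, n) = (m + n, n)

-- Nodes reached by all paths of length k, left to right.
-- Stern-Brocot indexes by the path, Calkin-Wilf by its reverse.
sbLevel, cwLevel :: Int -> [Rational]
sbLevel k = map fromPath (replicateM k [False, True])
cwLevel k = map (fromPath . reverse) (replicateM k [False, True])

-- Bit-reversal permutation of a list whose length is a power of two:
-- recursively place even-indexed elements before odd-indexed ones.
bitrev :: [a] -> [a]
bitrev []  = []
bitrev [x] = [x]
bitrev xs  = bitrev (evens xs) ++ bitrev (evens (tail xs))
  where evens (y : _ : ys) = y : evens ys
        evens ys           = ys
```

With these definitions, `bitrev (sbLevel 3) == cwLevel 3` evaluates to `True`, and `bitrev [0 .. 7]` gives `[0,4,2,6,1,5,3,7]`, the order of indices quoted above.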

The binary search tree property of the Stern-Brocot tree is appealing, so it is a shame 
to lose it. However, the loss has its compensations. For one thing, indexing the tree by 
the reverses of the execution paths means that executions with common endings, rather 
than common beginnings, are grouped together. A consequence of this is that the ancestors
in the Calkin-Wilf tree of a rational m/n record all the states that Euclid's algorithm visits
when starting at the pair (m, n). For example, one execution path of Euclid's algorithm is the
sequence of pairs (3, 4), (3, 1), (2, 1), (1, 1), and indeed the ancestors in the Calkin-Wilf tree
of 3/4 are 3/1, 2/1, 1/1. (Compare this with the Stern-Brocot tree, in which there is no obvious
relationship between parents and children.) Thus, a rational m/n with m < n is the left child
of the rational m/(n - m), whereas if m > n it is the right child of (m - n)/n. Equivalently, a
rational m/n has left child m/(m + n) and right child (n + m)/n. This shows how to generate the
Calkin-Wilf tree:

    rats6 :: [Rational]
    rats6 = bf (unfoldt step (1, 1))
      where step (m, n) = (m / n, (m, m + n), (n + m, n))
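
The parent and child relationships just described are simple enough to state directly on integer pairs. The following sketch uses our own names (children, parent, ancestors) and assumes the pair is in lowest terms, so that the walk upwards ends at the root (1, 1).

```haskell
-- Left and right child of the pair (m, n) in the Calkin-Wilf tree.
children :: (Integer, Integer) -> ((Integer, Integer), (Integer, Integer))
children (m, n) = ((m, m + n), (m + n, n))

-- One step of Euclid's subtractive algorithm, read as "go to the parent".
parent :: (Integer, Integer) -> Maybe (Integer, Integer)
parent (m, n)
  | m < n     = Just (m, n - m)   -- (m, n) is a left child
  | m > n     = Just (m - n, n)   -- (m, n) is a right child
  | otherwise = Nothing           -- the root has no parent

-- All ancestors, nearest first: the states visited by Euclid's algorithm.
ancestors :: (Integer, Integer) -> [(Integer, Integer)]
ancestors p = case parent p of
  Nothing -> []
  Just q  -> q : ancestors q
```

For example, `ancestors (3, 4)` is `[(3,1),(2,1),(1,1)]`, the execution path quoted above.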

Fig. 3. The neighbours (a) and successor (b) of an element x in the Calkin-Wilf tree.



5 Iterating through the rationals 

However, there is an even better compensation for the loss of the ordering property in 
moving from the Stern-Brocot to the Calkin-Wilf tree: it becomes possible to deforest the 
tree altogether, and generate the rationals directly, maintaining no additional state beyond 
the 'current' rational. This startling observation is due to Moshe Newman (Newman, 2003). 
In contrast, it is not at all obvious how to do this for the Stern-Brocot tree; the best we can 
do seems to be to deforest the tree as far as its levels, but this still entails additional state of 
increasing size. 

We will generate the rationals using the iterate operator, computing each from the previous 
one. 

    iterate :: (a -> a) -> a -> [a]
    iterate f x = x : iterate f (f x)

It is clear how to do this in some cases; for example, if m/n is a left child, then m < n, the
parent is m/(n - m), and the successor is the right child of the parent, namely n/(n - m). In terms
of x = m/n < 1, the parent is 1/(1/x - 1), and the successor is the right child of this, or
1 + 1/(1/x - 1) = 1/(1 - x). (The relationship between a node and its possible neighbours is
illustrated in Figure 3(a).)

More generally, x and its successor x' have a more distant ancestor in common. This
situation is illustrated in Figure 3(b). Here, x_0 = x is a right child of a parent x_1 = x - 1,
itself the right child of x_2 = x_1 - 1 = x - 2, and so on up to x_k = x - k, which is a left child.
Therefore x_k < 1, and so k = ⌊x⌋, the integer part of x. Element x_k is the left child of the
common ancestor 1/(1/x_k - 1), whose right child is x'_k = 1/(1 - (x - k)) = 1/(k + 1 - x). Element
x'_k has left child x'_(k-1) = 1/(1/x'_k + 1) = 1/(k + 2 - x), which has left child
x'_(k-2) = 1/(k + 3 - x), and so on down to x' = x'_0 = 1/(2k + 1 - x) = 1/(⌊x⌋ + 1 - {x})
(where {x} = x - ⌊x⌋ is the fractional part of x), which is the successor of x.

The formula x' = 1/(⌊x⌋ + 1 - {x}) for the successor of x even works in the last remaining case,
when x is on the right boundary and x' on the left boundary one level lower: then x is an
integer, so ⌊x⌋ = x and {x} = 0, and indeed x' = 1/(⌊x⌋ + 1 - {x}) = 1/(x + 1). This motivates
the following enumeration of the rationals:

    rats7 :: [Rational]
    rats7 = iterate next 1

    next x = recip (fromInteger n + 1 - y) where (n, y) = properFraction x

Each term is generated from its predecessor with a constant number of rational arithmetic
operations. (The Haskell standard library functions properFraction and recip take x to
(⌊x⌋, {x}) and 1/x, respectively.)
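
As a sanity check, the iteration can be compared against a breadth-first walk of the Calkin-Wilf tree built from children. The sketch below repeats next with an explicit type and adds cwBreadthFirst, a helper of our own that exists only for the comparison.

```haskell
import Data.Ratio ((%))

-- Successor in the Calkin-Wilf enumeration: 1 / (floor x + 1 - frac x).
next :: Rational -> Rational
next x = recip (fromInteger n + 1 - y)
  where (n, y) = properFraction x

rats7 :: [Rational]
rats7 = iterate next 1

-- Hypothetical reference: the same tree, level by level, from children.
cwBreadthFirst :: [Rational]
cwBreadthFirst =
  [ m % n | level <- iterate (concatMap kids) [(1, 1)], (m, n) <- level ]
  where kids (m, n) = [(m, m + n), (m + n, n)]
```

In GHCi, `take 7 rats7` gives 1, 1/2, 2, 1/3, 3/2, 2/3, 3, and `take 200 rats7 == take 200 cwBreadthFirst` evaluates to `True`.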

Could there be any simpler way to enumerate the positive rationals? 

Calkin and Wilf (Calkin & Wilf, 2000) discuss some additional properties of this
enumeration. It is not hard to show that the numerator of the successor next x of a rational x
is the denominator of x, so in fact the sequence of numerators 1, 1, 2, 1, 3, 2, 3, ... determines
the sequence of rationals. This sequence is actually the solution to a natural counting problem:
the ith element, starting from zero, counts the number of ways to write i in a redundant
binary representation in which each digit may be 0, 1 or 2. For example, the element with
index 4 is 3, and indeed there are three such ways of writing 4, namely 100, 20 and 12. Dijkstra
also explored this sequence (Dijkstra, 1982a; Dijkstra, 1982b), which he called fusc; he
showed, among other things, that fusc n = fusc n' where n' is the bit-reversal of n, another
connection with bit-reversal permutations.
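
Both claims about the numerators are easy to test on a prefix of the enumeration. In the sketch below, hyper is our own name for a direct recursive count of the redundant binary representations (last digit 0 or 2 for even i, last digit 1 for odd i); it is not taken from the cited papers.

```haskell
import Data.Ratio (denominator, numerator)

next :: Rational -> Rational
next x = recip (fromInteger n + 1 - y)
  where (n, y) = properFraction x

numerators, denominators :: [Integer]
numerators   = map numerator   (iterate next 1)
denominators = map denominator (iterate next 1)

-- Number of ways to write i in binary using the digits 0, 1 and 2.
hyper :: Integer -> Integer
hyper 0 = 1
hyper i
  | even i    = hyper (i `div` 2) + hyper (i `div` 2 - 1)  -- last digit 0 or 2
  | otherwise = hyper (i `div` 2)                          -- last digit 1
```

Here `take 7 numerators` is `[1,1,2,1,3,2,3]`, `hyper 4` is `3`, and on the first fifty elements the numerators agree with `map hyper [0 ..]` while `tail numerators` agrees with `denominators`.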

Of course, it is not difficult to generate all the rationals, zero and negative as well as
positive, in the same way: zero is a special initial case, and after that the positive rationals
alternate with their negations:

    rats8 :: [Rational]
    rats8 = iterate next' 0
      where next' 0 = 1
            next' x | x > 0     = negate x
                    | otherwise = next (negate x)



6 The continued fraction connection 

Some additional insights into these algorithms for enumerating the rationals may be obtained
by considering the continued fraction representation of the rationals. We write the finite
continued fraction:

    a_0 + 1/(a_1 + 1/(... + 1/a_n))

as the sequence of integer coefficients [a_0, a_1, ..., a_n]. For example, 3/4 is 0 + 1/(1 + 1/3), so
is represented by [0, 1, 3]. Every rational has a unique normal form as a regular continued
fraction; that is, as a finite sequence [a_0, a_1, ..., a_n] under the constraints that a_i > 0 for
i > 0 and that a_n > 1 if n > 0. Figure 4 shows the first few levels of the Calkin-Wilf tree with
rationals expressed as continued fractions.
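
For experimentation, a rational can be converted to its regular continued fraction by repeatedly splitting off the integer part and inverting what remains. This is a minimal sketch; the name toCF is ours, and floor is used so that negative rationals also receive a regular form.

```haskell
import Data.Ratio ((%))  -- (%) is used only to write example arguments

-- Regular continued fraction of a rational.
toCF :: Rational -> [Integer]
toCF x
  | y == 0    = [n]
  | otherwise = n : toCF (recip y)
  where n = floor x
        y = x - fromInteger n   -- fractional part, 0 <= y < 1
```

For example, `toCF (3 % 4)` is `[0,1,3]` and `toCF (5 % 3)` is `[1,1,2]`.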

Fig. 4. The first few levels of the Calkin-Wilf tree, as continued fractions. The lowest level
shown reads [0,4] [1,3] [0,1,1,2] [2,2] [0,2,2] [1,1,2] [0,1,3] [4].

We have shown that the positive rationals are the iterates of the function taking x to
1/(⌊x⌋ + 1 - {x}), whose computation requires a constant number of arithmetic operations on
rationals. Division is required in order to compute ⌊x⌋. However, if we represent rationals
by regular continued fractions, then this division can be avoided: the integer part of a rational
is simply the first term of the continued fraction. In fact, most of the required operations
are easy to implement: the fractional part is obtained by setting the first term to zero,
incrementing is a matter of incrementing the first term, and reciprocating either removes a
leading zero (if present) or prefixes a leading zero (if not). Only negation is not so obvious.
However, it turns out that a straightforward case analysis suffices, as the reader may check:

    negatecf [n0]               = [-n0]
    negatecf [n0, 2]            = [-n0 - 1, 2]
    negatecf (n0 : 1 : n2 : ns) = (-n0 - 1) : (n2 + 1) : ns
    negatecf (n0 : n1 : ns)     = (-n0 - 1) : 1 : (n1 - 1) : ns
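
One way to carry out that check is to evaluate both sides on some sample continued fractions. In the sketch below, cfValue is a helper of our own that evaluates a continued fraction as a rational, and the extra clause for the empty list is added only to make the function total.

```haskell
negatecf :: [Integer] -> [Integer]
negatecf [n0]               = [-n0]
negatecf [n0, 2]            = [-n0 - 1, 2]
negatecf (n0 : 1 : n2 : ns) = (-n0 - 1) : (n2 + 1) : ns
negatecf (n0 : n1 : ns)     = (-n0 - 1) : 1 : (n1 - 1) : ns
negatecf []                 = []  -- not a continued fraction; for totality only

-- Hypothetical helper: the value a_0 + 1/(a_1 + 1/(...)) as a rational.
-- Assumes a regular continued fraction, so no tail evaluates to zero.
cfValue :: [Integer] -> Rational
cfValue []       = 0
cfValue [a]      = fromInteger a
cfValue (a : as) = fromInteger a + recip (cfValue as)
```

For instance, `negatecf [0,1,3]` is `[-1,4]`, and indeed -1 + 1/4 = -3/4; likewise `negatecf [2,3]` is `[-3,1,2]`, whose value is -7/3.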

Given this implementation of negation, it is straightforward to derive the following data
refinement of rats7. That is, if c is the continued fraction representation of rational x, then
nextcf c is the continued fraction representation of ⌊x⌋ + 1 - {x}.

    type CF = [Integer]

    rats9 :: [CF]
    rats9 = iterate (recipcf . nextcf) [1]
      where nextcf [n0]               = [n0 + 1]
            nextcf [n0, 2]            = [n0, 2]
            nextcf (n0 : 1 : n2 : ns) = n0 : (n2 + 1) : ns
            nextcf (n0 : n1 : ns)     = n0 : 1 : (n1 - 1) : ns
            recipcf (0 : ns) = ns
            recipcf ns       = 0 : ns

For example, consider the third clause for nextcf. If x is represented by c = n0 : 1 : n2 : ns,
then ⌊x⌋ = n0, and {x} is represented by 0 : 1 : n2 : ns; this negates to (-1) : (n2 + 1) : ns,
which when increased by n0 + 1 yields n0 : (n2 + 1) : ns.

This uses a constant number of arbitrary-precision integer additions and subtractions per 
term, but no divisions or multiplications. Of course, the result will be a list of continued 
fractions. These can be converted to rationals with the following function: 

    cf2rat :: CF -> Rational
    cf2rat = mkRat . foldr op (1, 0)
      where op m (n, d) = (m * n + d, n)

This uses additions and multiplications linear in the size of the continued fraction, but again
no divisions (because coprimality of the pairs (n, d) is invariant under op m).

An additional thing that strikes the observer here is that the coefficients of the continued
fractions on every level of the Calkin-Wilf tree sum to the same value, which is also the
depth of that level. This is easy to justify when one considers the translation of Figure 3
to continued fractions: an element x has right child x + 1 (and incrementing a continued
fraction is a matter of incrementing the first term, and hence incrementing the sum) and left
child 1/(1/x + 1) (and reciprocating a continued fraction is a matter of either prefixing or
removing a leading zero, neither of which changes the sum). As a corollary, note that there
are exactly 2^(k-1) regular positive continued fractions that sum to k.
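
The refinement, the conversion back to rationals and the level-sum observation can all be exercised together. The sketch below lifts nextcf and recipcf to the top level, writes mkRat with (%) from Data.Ratio, and adds cfLevels, a helper of our own that cuts the enumeration into levels of sizes 1, 2, 4, and so on; the clause for the empty list is there only for totality.

```haskell
import Data.Ratio ((%))

type CF = [Integer]

nextcf :: CF -> CF
nextcf [n0]               = [n0 + 1]
nextcf [n0, 2]            = [n0, 2]
nextcf (n0 : 1 : n2 : ns) = n0 : (n2 + 1) : ns
nextcf (n0 : n1 : ns)     = n0 : 1 : (n1 - 1) : ns
nextcf []                 = []  -- not a continued fraction; for totality only

recipcf :: CF -> CF
recipcf (0 : ns) = ns
recipcf ns       = 0 : ns

rats9 :: [CF]
rats9 = iterate (recipcf . nextcf) [1]

cf2rat :: CF -> Rational
cf2rat = mkRat . foldr op (1, 0)
  where op m (n, d)  = (m * n + d, n)
        mkRat (n, d) = n % d

-- Hypothetical helper: split the enumeration into tree levels.
cfLevels :: [[CF]]
cfLevels = chunk 1 rats9
  where chunk k xs = let (ys, zs) = splitAt k xs in ys : chunk (2 * k) zs
```

Here `take 4 rats9` is `[[1],[0,2],[2],[0,3]]`, `cf2rat [0,1,3]` is 3/4, and `cfLevels !! 3` reproduces the level listed with Figure 4, every entry of which sums to 4.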

Graham, Knuth and Patashnik (1994, §6.7) present a connection between the continued-fraction
Stern-Brocot tree and Euclid's algorithm; we translate their observations here to
the Calkin-Wilf tree. They show that the path to an element x in the tree is directly related
to the continued fraction of x: if the path to x is L^(a_n) R^(a_(n-1)) L^(a_(n-2)) ... R^(a_0),
then x is represented by the continued fraction [a_0, a_1, ..., a_n + 1] (which is not regular if
a_n = 0, but normalizes then to [a_0, a_1, ..., a_(n-1) + 1]). For example, the rational 3/4 appears
at the end of the path L^0 R^2 L^1 R^0, so has the continued fraction representation
[0, 1, 2, 0 + 1], which normalizes to [0, 1, 3] as expected.
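
One possible reading of this correspondence as code is sketched below. The function pathToCF is our own: it takes a root-to-node path (False for left, True for right), reads the run lengths from the node end, and adds one to the topmost run. Stopping at the last non-empty run, instead of padding with an empty one, means the result is already in normal form; this shortcut is our assumption and not part of the cited presentation.

```haskell
-- Continued fraction of the Calkin-Wilf node at the end of a path.
-- Run lengths are read from the node end and alternate R, L, R, ...
pathToCF :: [Bool] -> [Integer]
pathToCF = runs True . reverse
  where
    runs b xs = case span (== b) xs of
      (ys, []) -> [count ys + 1]              -- topmost run: add one
      (ys, zs) -> count ys : runs (not b) zs
    count ys = toInteger (length ys)
```

For the path R, R, L to 3/4, `pathToCF [True, True, False]` gives `[0,1,3]`, and the empty path gives `[1]` for the root.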

This view of paths, in which consecutive steps in the same direction are grouped together,
conforms to the usual presentation of Euclid's algorithm using division instead of
subtraction:

    gcd :: (Integer, Integer) -> Integer
    gcd (m, n) = if m == 0 || n == 0 then m + n else
                 if m < n then gcd (m, n `mod` m) else
                 if m > n then gcd (m `mod` n, n) else m

Each modulus computation casts out a certain number of multiples of the modulus, which
corresponds in the Calkin-Wilf tree to a certain number of consecutive steps in the same
direction. Graham, Knuth and Patashnik's observation therefore demonstrates a connection
between the number of terms in the continued fraction representation of m/n and the number
of steps taken to compute gcd (m, n) by Euclid's division-based algorithm.
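
A standard way to make this connection concrete is to collect the quotients produced by the division-based algorithm: they are exactly the terms of the regular continued fraction, one per division. The sketch below uses our own name euclidCF and assumes a positive denominator.

```haskell
-- Quotients of Euclid's division-based algorithm on (m, n), with n > 0.
-- Each division contributes one term of the continued fraction of m/n.
euclidCF :: (Integer, Integer) -> [Integer]
euclidCF (m, n)
  | r == 0    = [q]
  | otherwise = q : euclidCF (n, r)
  where (q, r) = m `divMod` n
```

For example, `euclidCF (3, 4)` and `euclidCF (6, 8)` both give `[0,1,3]`, using three divisions each.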
    sb = zipW mkRat (t, u)
      where t = Node (1, t, zipW (uncurry (+)) (t, u))
            u = mirror t

of the Stern-Brocot tree, where

    zipW f = unfoldt (apply f)
      where apply f (Node (a, t, u), Node (b, v, w)) = (f (a, b), (t, v), (u, w))

and

    mirror = foldt switch where switch (a, t, u) = Node (a, u, t)

That is, the denominator tree is the mirror image of the numerator tree; the numerator tree 
has 1 at the root, itself as its left child, and the element-wise sum of the numerator and 
denominator trees as its right child. 
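
This cyclic definition can be run as it stands, given the tree operations from Section 3. The sketch below repeats them with explicit types so that it is self-contained; levels is a helper of our own for inspecting the result, and mkRat is written with (%) from Data.Ratio.

```haskell
import Data.Ratio ((%))

data Tree a = Node (a, Tree a, Tree a)

foldt :: ((a, b, b) -> b) -> Tree a -> b
foldt f (Node (a, x, y)) = f (a, foldt f x, foldt f y)

unfoldt :: (b -> (a, b, b)) -> b -> Tree a
unfoldt f x = let (a, y, z) = f x in Node (a, unfoldt f y, unfoldt f z)

-- Combine two infinite trees node by node.
zipW :: ((a, b) -> c) -> (Tree a, Tree b) -> Tree c
zipW f = unfoldt apply
  where apply (Node (a, t, u), Node (b, v, w)) = (f (a, b), (t, v), (u, w))

mirror :: Tree a -> Tree a
mirror = foldt switch
  where switch (a, t, u) = Node (a, u, t)

-- Numerator tree t and denominator tree u, defined in terms of each other.
sb :: Tree Rational
sb = zipW mkRat (t, u)
  where t = Node (1, t, zipW (uncurry (+)) (t, u))
        u = mirror t
        mkRat (m, n) = m % n

-- Hypothetical helper: the labels of a tree, one list per level.
levels :: Tree a -> [[a]]
levels = foldt glue
  where glue (a, xs, ys) = [a] : zipWith (++) xs ys
```

Evaluating `take 3 (levels sb)` gives [1], [1/2, 2] and [1/3, 2/3, 3/2, 3], the first three levels of the Stern-Brocot tree.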

Boyko Bantchev and Cristian Calude brought to our attention work by D. N. Andreev 
(n.d.) and Shen Yu-Ting (1980), respectively. They define yet another enumeration of the 
positive rationals; although neither mentions trees, they describe in effect the construction 

    rats10 :: [Rational]
    rats10 = bf (unfoldt step (1, 1))
      where step (m, n) = (m / n, (n + m, n), (n, n + m))

The elements on each level are the same as in the Stern-Brocot and Calkin-Wilf trees, but
in a different order again; like the Stern-Brocot tree, this tree also does not give rise to an
iterative enumeration of the rationals.

We would never have embarked upon this problem at all without the inspiration of Aigner
and Ziegler's beautiful book (Aigner & Ziegler, 2004), promoting, among others, the elegant
work of Calkin and Wilf (Calkin & Wilf, 2000) and Newman (Newman, 2003). The code
is formatted with Andres Löh's and Ralf Hinze's wonderful lhs2TeX.